In this notebook, a template is provided for you to implement, in stages, the functionality required to successfully complete this project. If additional code is needed that cannot be included in the notebook, be sure that the Python code is successfully imported and included in your submission. Sections that begin with 'Implementation' in the header indicate where you should begin your implementation. Note that some implementation sections are optional and are marked with 'Optional' in the header.
In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.
Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. In addition, Markdown cells can typically be edited by double-clicking the cell to enter edit mode.
Visualize the German Traffic Signs Dataset. This is open ended; some suggestions include plotting traffic sign images, plotting the count of each sign, etc. Be creative!
The pickled data is a dictionary with 4 key/value pairs; this notebook uses 'features' (the image data) and 'labels' (the class ids).
import math
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
import pandas as pd
# import cv2
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
import tensorflow as tf
%matplotlib inline
# Load pickled data
import pickle
# TODO: fill this in based on where you saved the training and testing data
training_file = "traffic-signs-data.zip/train.p"
testing_file = "traffic-signs-data.zip/test.p"
with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)
X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']
train.keys()
### To start off let's do a basic data summary.
# TODO: number of training examples
n_train = X_train.shape[0]
# TODO: number of testing examples
n_test = X_test.shape[0]
# TODO: what's the shape of an image?
image_shape = X_train.shape[1:]
# TODO: how many classes are in the dataset
n_classes = np.unique(y_train).shape[0]
print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
### Data exploration visualization goes here.
### Feel free to use as many code cells as needed.
signnames = pd.read_csv("signnames.csv", index_col=["ClassId"])
# Plot 9 random images from train set
plt.figure(figsize=(14, 14))
for i in range(1, 10):
    plt.subplot(3, 3, i)
    img = np.random.randint(2, high=n_train)
    plt.imshow(X_train[img, :, :, :])
    plt.title(signnames.SignName.loc[y_train[img]])
# Plot 1st image from each class
plt.figure(figsize=(15, 5*math.ceil(n_classes/3)))
n_freq = list(zip(*np.unique(y_train, return_counts=True)))
for i, n_ in enumerate(n_freq):
    n, freq = n_
    img = np.argmax(y_train == n)
    plt.subplot(math.ceil(n_classes/3), 3, i+1)
    plt.imshow(X_train[img, :, :, :])
    plt.title(signnames.SignName.loc[y_train[img]] + ", count: " + str(freq))
Design and implement a deep learning model that learns to recognize traffic signs. Train and test your model on the German Traffic Sign Dataset.
There are various aspects to consider when thinking about this problem:
Here is an example of a published baseline model on this problem. It's not required to be familiar with the approach used in the paper, but it's good practice to try to read papers like these.
Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.
### Preprocess the data here.
### Feel free to use as many code cells as needed.
def normalize_zero_one(x):
    return x / 255 - 0.5

if X_train.max() > 1:
    X_train = normalize_zero_one(X_train)
    X_test = normalize_zero_one(X_test)
Describe the techniques used to preprocess the data.
Answer:
First I normalised the data by dividing by 255.
This scales all pixel values to the range 0 to 1.
Although this should be enough, I decided to subtract 0.5 from the already scaled values.
I did this so that, when using dropout, 0 lies in the middle of the range for every pixel. Strictly speaking it is not necessary, because w*x with x = 0 always produces 0, and with ReLUs the gradient is also 0 there.
In my experiments, x/255 - 0.5 gave the best accuracy results.
I don't see a need for further preprocessing: the 3 colour channels deliver much more information to the model than a single grayscale channel. For example, the red borders of signs are very discriminative.
In the class distribution figure below, it is obvious that the distributions in the test and train sets are similar. There is no need to balance the classes: when no significant features can be extracted from X*w, the model will simply learn a higher bias towards the more frequent classes, which improves accuracy when nothing else can be learned.
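A quick sanity check of the claimed range (a minimal added sketch, assuming the preprocessing cell above has already been run):

print("train pixel range after scaling:", X_train.min(), "to", X_train.max())
# expected: approximately -0.5 to 0.5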
### Generate additional data (if you want to!)
### and split the data into training/validation/testing sets here.
### Feel free to use as many code cells as needed.
train_set, cv_set, train_labels, cv_labels = train_test_split(X_train, y_train)
plt.figure(figsize=(14,7))
plt.subplot(1,2,1)
n_freq = np.unique(y_train, return_counts=True)
plt.bar(n_freq[0], n_freq[1])
plt.title("Distribution of different type of signs - Train set")
plt.subplot(1,2,2)
n_freq = np.unique(y_test, return_counts=True)
plt.bar(n_freq[0], n_freq[1])
plt.title("Distribution of different type of signs - Test set")
Describe how you set up the training, validation and testing data for your model. If you generated additional data, why?
Answer:
I leave the official test set as the test set and split the original train set into two parts: a new train set and a cross-validation set, keeping the default ratio of 0.75 : 0.25. From the graphs above it is obvious that the class imbalances are similar in the train and test sets, so there is no need to balance the classes.
The achieved cross-validation accuracy of 99+ % makes me feel good about the model training, and I don't think additional generated data is needed. As a human, I can effectively recognize only about 20% of these blurry pictures myself.
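If the class imbalance ever did become a problem, the same split could be made stratified so that every class keeps its proportion in both parts. A minimal sketch (not used above; the variable names are illustrative):

train_set_s, cv_set_s, train_labels_s, cv_labels_s = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=0)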
### Define your architecture here.
### Feel free to use as many code cells as needed.
With hidden layers of 1500 and 1000 units, the network reached 100% accuracy on the train set and 93.8% on the CV set at iteration 15,000 (batch size 128, learning rate 0.001). From here I start regularizing: adding dropout(0.5) at the input gave very low accuracy (~3%); also experimented with mean normalization.
sess = None
n_train = train_set.shape[0]
n_cv = cv_set.shape[0]
print("Number of training examples =", n_train)
print("Number of cv examples =", n_cv)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
n_input = image_shape[0] * image_shape[1] * image_shape[2]
n_hidden_layer_1 = 1500
n_hidden_layer_2 = 1000
# Store layers weight & bias
weights = {
    'hidden_layer1': tf.Variable(tf.random_normal([n_input, n_hidden_layer_1])),
    'hidden_layer2': tf.Variable(tf.random_normal([n_hidden_layer_1, n_hidden_layer_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_layer_2, n_classes]))
}
biases = {
    'hidden_layer1': tf.Variable(tf.random_normal([n_hidden_layer_1])),
    'hidden_layer2': tf.Variable(tf.random_normal([n_hidden_layer_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
# tf Graph input
x = tf.placeholder(tf.float32, [None, 32, 32, 3])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)
x_flat = tf.reshape(x, [-1, n_input])
x_flat = tf.nn.dropout(x_flat, keep_prob)
# Hidden layer with RELU activation
layer_1 = tf.add(tf.matmul(x_flat, weights['hidden_layer1']), biases['hidden_layer1'])
layer_1 = tf.nn.relu(layer_1)
# Dropout
# layer_1 = tf.nn.dropout(layer_1, keep_prob)
layer_2 = tf.add(tf.matmul(layer_1, weights['hidden_layer2']), biases['hidden_layer2'])
layer_2 = tf.nn.relu(layer_2)
# Dropout
# layer_2 = tf.nn.dropout(layer_2, keep_prob)
# Output layer with linear activation
logits = tf.matmul(layer_2, weights['out']) + biases['out']
# Accuracy
correct_prediction = tf.equal(tf.arg_max(logits, 1), tf.arg_max(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
if sess:
    sess.close()
sess = tf.Session()
init = tf.initialize_all_variables()
sess.run(init)
from sklearn.preprocessing import OneHotEncoder
encoder = OneHotEncoder(n_values=n_classes)
encoder.fit(train_labels.reshape((-1,1)))
train_labels_one_hot = encoder.transform(train_labels.reshape((-1,1))).todense()
cv_labels_one_hot = encoder.transform(cv_labels.reshape((-1,1))).todense()
test_labels_one_hot = encoder.transform(y_test.reshape((-1,1))).todense()
def batch(x, y, size=3, batches=3):
    for i in range(batches):
        idx = np.random.randint(0, x.shape[0], size)
        yield x[idx], y[idx]
# for i, b in enumerate(batch(train_set, train_labels_one_hot)):
# x_batch, y_batch = b
# print(x_batch.shape)
# print(y_batch.shape)
# 0 0.0234375 0.0313169
# 1000 0.8125 0.759665
# 2000 0.859375 0.822095
# 3000 0.90625 0.864327
# 4000 0.960938 0.873712
# 5000 0.921875 0.891156
# 6000 0.945312 0.903397
# 7000 1.0 0.915026
# 8000 0.984375 0.903295
# 9000 0.96875 0.903703
# 10000 0.992188 0.903703
# 11000 0.953125 0.903499
# 12000 0.96875 0.925431
# 13000 0.960938 0.931042
# 14000 0.992188 0.935224
# 15000 1.0 0.938488
# 16000 0.984375 0.929511
# 17000 0.960938 0.944813
# 18000 0.953125 0.920126
# 19000 0.976562 0.943895
# 20000 0.992188 0.935836
# Following this tutorial: https://github.com/Hvass-Labs/TensorFlow-Tutorials/blob/master/02_Convolutional_Neural_Network.ipynb
def new_weights(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.05))

def new_biases(length):
    return tf.Variable(tf.constant(0.05, shape=[length]))
def new_conv_layer(input,
                   num_input_channels,
                   filter_size,
                   num_filters,
                   max_pool_stride=2,  # Use 0 to skip max_pool
                   conv_stride=1):
    shape = [filter_size, filter_size, num_input_channels, num_filters]
    weights = new_weights(shape=shape)
    biases = new_biases(length=num_filters)
    layer = tf.nn.conv2d(input=input,
                         filter=weights,
                         strides=[1, conv_stride, conv_stride, 1],
                         padding='SAME')
    layer += biases
    if max_pool_stride > 0:
        layer = tf.nn.max_pool(value=layer,
                               ksize=[1, max_pool_stride, max_pool_stride, 1],
                               strides=[1, max_pool_stride, max_pool_stride, 1],
                               padding='SAME')
    layer = tf.nn.relu(layer)
    # Note that ReLU is normally executed before the pooling,
    # but since relu(max_pool(x)) == max_pool(relu(x)) we can
    # save 75% of the relu-operations by max-pooling first.
    return layer, weights
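# Quick numeric check of the note above (a small added sketch, not part of the
# original notebook): ReLU and max-pooling commute because both are monotonic,
# so relu(max(x)) == max(relu(x)).
_demo = np.array([-3.0, 2.0, 1.0, -5.0])
assert np.maximum(_demo.max(), 0.0) == np.maximum(_demo, 0.0).max()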
def flatten_layer(layer):
    layer_shape = layer.get_shape()
    # layer_shape == [num_images, img_height, img_width, num_channels]
    num_features = layer_shape[1:4].num_elements()
    layer_flat = tf.reshape(layer, [-1, num_features])
    return layer_flat, num_features
def new_fc_layer(input,
                 num_inputs,
                 num_outputs,
                 use_relu=True):
    weights = new_weights(shape=[num_inputs, num_outputs])
    biases = new_biases(length=num_outputs)
    layer = tf.matmul(input, weights) + biases
    if use_relu:
        layer = tf.nn.relu(layer)
    return layer
filter_size1 = 5
num_filters1 = 32
filter_size2 = 5
num_filters2 = 64
fc_size = 512
n_train = train_set.shape[0]
n_cv = cv_set.shape[0]
print("Number of training examples =", n_train)
print("Number of cv examples =", n_cv)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)
# tf Graph input
x = tf.placeholder(tf.float32, [None, 32, 32, 3], name='x_image')
y = tf.placeholder(tf.float32, [None, n_classes], name='y_true')
keep_prob = tf.placeholder(tf.float32)
layer_conv1, weights_conv1 = new_conv_layer(input=x,
                                            num_input_channels=3,
                                            filter_size=filter_size1,
                                            num_filters=num_filters1)
print('conv layer 1', layer_conv1.get_shape(), 'weights shape:', weights_conv1.get_shape())
layer_conv1 = tf.nn.dropout(layer_conv1, keep_prob)
layer_conv2, weights_conv2 = new_conv_layer(input=layer_conv1,
                                            num_input_channels=num_filters1,
                                            filter_size=filter_size2,
                                            num_filters=num_filters2)
print('conv layer 2', layer_conv2.get_shape(), 'weights shape:', weights_conv2.get_shape())
layer_conv2 = tf.nn.dropout(layer_conv2, keep_prob)
layer_flat, num_features = flatten_layer(layer_conv2)
print('layer_flat', layer_flat.get_shape(), 'num_features:', num_features)
layer_fc1 = new_fc_layer(input=layer_flat,
                         num_inputs=num_features,
                         num_outputs=fc_size,
                         use_relu=True)
print('layer_fc1', layer_fc1.get_shape())
logits = new_fc_layer(input=layer_fc1,
                      num_inputs=fc_size,
                      num_outputs=n_classes,
                      use_relu=False)
print('logits', logits.get_shape())
y_pred = tf.nn.softmax(logits)
# Accuracy
correct_prediction = tf.equal(tf.arg_max(logits, 1), tf.arg_max(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost)
if sess:
    sess.close()
sess = tf.Session()
init = tf.initialize_all_variables()
sess.run(init)
for i, b in enumerate(batch(train_set, train_labels_one_hot, size=256, batches=5001)):
    x_batch, y_batch = b
    _, batch_acc, batch_cost = sess.run([optimizer, accuracy, cost],
                                        feed_dict={x: x_batch, y: y_batch, keep_prob: 0.7})
    if i % 500 == 0:
        cv_acc, cv_cost = sess.run([accuracy, cost], feed_dict={x: cv_set, y: cv_labels_one_hot, keep_prob: 1.0})
        print(i, "accuracy:", batch_acc, cv_acc, "cost:", batch_cost, cv_cost)
# update the model with the CV set (before test set)
for i, b in enumerate(batch(cv_set, cv_labels_one_hot, size=256, batches=1001)):
    x_batch, y_batch = b
    _, batch_acc, batch_cost = sess.run([optimizer, accuracy, cost],
                                        feed_dict={x: x_batch, y: y_batch, keep_prob: 0.7})
    if i % 500 == 0:
        cv_acc, cv_cost = sess.run([accuracy, cost], feed_dict={x: cv_set, y: cv_labels_one_hot, keep_prob: 1.0})
        print(i, "accuracy:", batch_acc, cv_acc, "cost:", batch_cost, cv_cost)
test_acc, test_cost = sess.run([accuracy, cost], feed_dict={x: X_test, y: test_labels_one_hot, keep_prob: 1.0})
print("test accuracy:", test_acc, "cost:", test_cost)
# # no dropout:
# 0 accuracy: 0.015625 0.0241763 cost: 19291.8 17137.8
# 1000 accuracy: 0.796875 0.745588 cost: 275.525 489.308
# 2000 accuracy: 0.890625 0.827196 cost: 173.074 311.284
# 3000 accuracy: 0.921875 0.850556 cost: 65.481 261.992
# 4000 accuracy: 0.929688 0.878813 cost: 66.8335 198.663
# 5000 accuracy: 0.960938 0.89442 cost: 21.034 175.927
# 6000 accuracy: 0.945312 0.882995 cost: 18.4813 203.894
# 7000 accuracy: 0.914062 0.891258 cost: 61.8492 191.083
# 8000 accuracy: 0.953125 0.903601 cost: 32.2379 174.571
# 9000 accuracy: 0.984375 0.927063 cost: 14.4275 125.736
# 10000 accuracy: 0.960938 0.916046 cost: 34.4207 145.993
# 11000 accuracy: 0.992188 0.916454 cost: 1.96126 133.523
# 12000 accuracy: 0.976562 0.927777 cost: 32.4752 130.859
# 13000 accuracy: 0.984375 0.937876 cost: 15.2652 113.364
# 14000 accuracy: 0.976562 0.941753 cost: 25.3948 107.514
# 15000 accuracy: 0.992188 0.943793 cost: 1.24925 101.303
# 16000 accuracy: 0.976562 0.929307 cost: 8.563 128.648
# 17000 accuracy: 0.992188 0.939202 cost: 5.99585 120.153
# 18000 accuracy: 0.984375 0.940018 cost: 0.89362 109.291
# 19000 accuracy: 0.984375 0.930021 cost: 1.48128 134.176
# 20000 accuracy: 0.992188 0.951341 cost: 4.20262 91.0733
# dropout input_x 0.9
# 0 accuracy: 0.0703125 0.0247883 cost: 26374.5 19833.0
# 1000 accuracy: 0.757812 0.74314 cost: 225.324 507.833
# 2000 accuracy: 0.851562 0.830664 cost: 214.398 281.104
# 3000 accuracy: 0.921875 0.835765 cost: 73.9734 314.112
# 4000 accuracy: 0.945312 0.888503 cost: 69.3543 177.6
# 5000 accuracy: 0.9375 0.896358 cost: 84.3492 170.276
# 6000 accuracy: 0.945312 0.878405 cost: 23.1133 211.742
# 7000 accuracy: 0.953125 0.902887 cost: 34.8693 180.337
# 8000 accuracy: 0.953125 0.914414 cost: 31.0314 156.178
# 9000 accuracy: 0.960938 0.920535 cost: 14.6948 152.378
# 10000 accuracy: 0.96875 0.926043 cost: 15.2959 133.041
# 11000 accuracy: 0.984375 0.927471 cost: 10.5089 132.867
# 12000 accuracy: 1.0 0.920331 cost: 0.0 159.027
# 13000 accuracy: 0.960938 0.91574 cost: 36.4833 162.349
# 14000 accuracy: 0.992188 0.937978 cost: 36.6903 122.351
# 15000 accuracy: 0.96875 0.93145 cost: 45.7581 133.481
# 16000 accuracy: 0.992188 0.948689 cost: 3.9844 100.509
# 17000 accuracy: 0.96875 0.925533 cost: 23.0183 133.56
# 18000 accuracy: 0.992188 0.946445 cost: 5.31872 99.019
# 19000 accuracy: 0.992188 0.947465 cost: 10.7802 107.175
# 20000 accuracy: 1.0 0.938182 cost: 0.0 125.835
# CNN: ~3k iterations, 128 then 256 batch size accuracy: 1.0 0.987147 cost: 0.00232341 0.0624209
# dropout 0.8:
# 4000 accuracy: 0.992188 0.989289 cost: 0.0369007 0.0515085
# 5000 accuracy: 1.0 0.989799 cost: 0.0172542 0.0469535
# 6000 accuracy: 0.992188 0.989901 cost: 0.0308556 0.0456895
# 6331 accuracy: 0.996094 0.992349 cost: 0.00970181 0.0378543
# 7300 accuracy: 0.996094 0.993267 cost: 0.0110829 0.0316904
# sess.close()  # left open here: the session is reused below for predictions on the downloaded images
What does your final architecture look like? (Type of model, layers, sizes, connectivity, etc.) For reference on how to build a deep neural network using TensorFlow, see Deep Neural Network in TensorFlow from the classroom.
Answer:
My final model is a CNN with 2 convolutional layers of filter size 5, producing 32 feature maps in layer 1 and 64 in layer 2, each followed by 2x2 max-pooling and ReLU. These feed a fully-connected layer of 512 units and a final classification layer with softmax. For the cost I use softmax cross entropy, and for the updates I use Adam with a learning rate of 0.001.
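For reference, a shape-by-shape summary derived from the layer definitions above (assuming the 32x32x3 inputs used throughout this notebook):
input: 32x32x3
conv 5x5, 32 filters, 2x2 max-pool, ReLU: 16x16x32
conv 5x5, 64 filters, 2x2 max-pool, ReLU: 8x8x64
flatten: 4096
fully connected, 512 units, ReLU: 512
fully connected + softmax: n_classes outputs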
### Train your model here.
### I trained my final model in the cell above; the session is not closed
How did you train your model? (Type of optimizer, batch size, epochs, hyperparameters, etc.)
Answer: I used the Adam optimizer with a batch size of 256 and performed 5k iterations on random batches. Then I updated the model for another 1k iterations of 256-sample batches drawn from the CV data. The final dropout keep probability I used for training is 0.7.
What approach did you take in coming up with a solution to this problem?
Answer: I first experimented with a 2-layer FFNN of different sizes and dropout settings. Then I experimented with a CNN and observed better results on the CV set. I experimented with different hyperparameters: number of layers, filter sizes, number of convolution filters, size of the fully-connected layers, dropout, etc.
How did I decide on the batch size: I experimented with batch sizes between 32 and 1024. Memory was clearly not a problem, so I could use as big a batch size as I wanted. In the end I settled on 256, because it contains enough data for a correct gradient estimate and also gives fast enough convergence of the model.
How did I decide how many epochs to use:
My model training is not based on epochs but on batch iterations with repetition: for each iteration I draw 256 random samples from the train set. At the beginning I used a with-session construction, which released the session object every time training finished; later I switched to an explicit close and reused the already trained model.
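As a rough back-of-the-envelope check (an added sketch, not part of the original training code), the random-batch training can be translated into an approximate number of passes over the data:

approx_passes = 5001 * 256 / n_train  # n_train = train_set.shape[0] from the split above
print("approximate passes over the training data:", approx_passes)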
Through experimentation I reached a level of 20k iterations, where both the train and validation costs are at their lowest point. This means the model generalizes well: it reached about 99% accuracy and doesn't overfit.
I tested FFNNs with 1, 2 and 3 layers and CNNs with different numbers and sizes of filters and layers. The architecture above gave the best accuracy while still being relatively fast to train.
Take several pictures of traffic signs that you find on the web or around you (at least five), and run them through your classifier on your computer to produce example results. The classifier might not recognize some local signs but it could prove interesting nonetheless.
You may find signnames.csv useful as it contains mappings from the class id (integer) to the actual sign name.
Use the code cell (or multiple code cells, if necessary) to implement the first step of your project. Once you have completed your implementation and are satisfied with the results, be sure to thoroughly answer the questions that follow.
### Load the images and plot them here.
### Feel free to use as many code cells as needed.
import glob
files = glob.glob("signs/*")
print(files)
import os
from scipy import misc
def load_and_resize(f):
    image = misc.imread(f, mode="RGB")
    image = misc.imresize(image, (32, 32))
    return image
images = np.array([load_and_resize(f) for f in files])
print(images.shape)
plt.figure(figsize=(14, 28))
for i in range(0, 18):
    plt.subplot(6, 3, i+1)
    plt.imshow(images[i, :, :, :])
Choose five candidate images of traffic signs and provide them in the report. Are there any particular qualities of the image(s) that might make classification difficult? It would be helpful to plot the images in the notebook.
Answer: I have plotted the images. They are obviously very different from the train set, even to the naked eye. Some key differences: the train set images are all centered and the sign spans the full extent of the picture, whereas in the downloaded images the sign may take up only a small part of the picture and is sometimes positioned in a corner. The contrast, brightness and colours of the train set and the downloaded set are clearly different, and in the downloaded images the signs are not perpendicular to the camera.
Also, there is no NO STOPPING sign in the train/test set, but there is one among my images! :(
### Run the predictions here.
### Feel free to use as many code cells as needed.
[image_predictions, image_logits] = sess.run([y_pred, logits], feed_dict={x: images, keep_prob: 1.0})
concreate_prediction = np.argmax(image_predictions, axis=1)
probability = np.max(image_predictions, axis=1)
print("predictions shape:", concreate_prediction.shape)
print("certainty:", probability)
print("concreate_prediction:", concreate_prediction)
plt.figure(figsize=(14, 28))
for i in range(0, 18):
    plt.subplot(6, 3, i+1)
    plt.imshow(images[i, :, :, :])
    plt.title("Prediction: " + signnames.ix[concreate_prediction[i]]['SignName'])
# true class ids for the downloaded images (defined here because this cell uses them)
images_cat = [16, 17, 16, 13, 0, 18, 14, 18, 14, 25]
average_image_acc = sum(np.argmax(image_predictions, 1)[:10] == images_cat) / 9
# one image's sign (NO STOPPING) is not among the dataset classes, so divide the sum by 9, not 10
print("average accuracy for 9 images:", average_image_acc)
top_5_predictions = (-image_logits).argsort(1)[:10,:5]
print("top_5_predictions:\n", top_5_predictions)
top_true = [images_cat[i] in z for i, z in enumerate(top_5_predictions)]
print("top5 accurcy:", sum(top_true) / 9)
plot_img = 10
for i in range(plot_img):
    plt.figure(figsize=(14, 20))
    plt.subplot(plot_img, 3, i*3 + 1)
    plt.bar(list(range(n_classes)), image_predictions[i])
    plt.title(signnames.ix[concreate_prediction[i]]['SignName'])
    plt.subplot(plot_img, 3, i*3 + 2)
    plt.bar(list(range(n_classes)), image_logits[i])
    plt.title(signnames.ix[images_cat[i]]['SignName'])
    plt.subplot(plot_img, 3, i*3 + 3)
    plt.imshow(images[i, :, :, :])
    plt.title("IN TOP 5" if top_true[i] else "NOT IN TOP 5")
    plt.show()
Is your model able to perform equally well on captured pictures or a live camera stream when compared to testing on the dataset?
Answer: My model definitely does not perform as well on the images downloaded and resized from the net: it has 22% accuracy on 9 random images from the internet. Still, that is not so bad considering the differences between these images and the training data.
### Visualize the softmax probabilities here.
### Feel free to use as many code cells as needed.
Use the model's softmax probabilities to visualize the certainty of its predictions, tf.nn.top_k could prove helpful here. Which predictions is the model certain of? Uncertain? If the model was incorrect in its initial prediction, does the correct prediction appear in the top k? (k should be 5 at most)
Answer: The certainty is already shown in the figure above. The model is quite certain about each classification; see that figure (on the left the softmax probabilities, in the middle the raw logits, and on the right the original image). The titles show the predicted sign, the real sign, and whether the true class appears in the top 5 predictions. The model is in fact overconfident. Using the top 5 predictions from the raw logits gave 55% accuracy for the 9 images, which is not bad at all.
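For reference, the same top-k information can be read directly from the softmax output with tf.nn.top_k, as the question suggests. A minimal sketch (assumes the session is still open and images is defined as above):

top5_op = tf.nn.top_k(y_pred, k=5)
top5_values, top5_indices = sess.run([top5_op.values, top5_op.indices],
                                     feed_dict={x: images, keep_prob: 1.0})
print("top-5 softmax probabilities per image:\n", top5_values)
print("top-5 predicted class ids per image:\n", top5_indices)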
If necessary, provide documentation for how an interface was built for your model to load and classify newly-acquired images.
Answer:
It is the load_and_resize(file_path) method in the notebook.
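A minimal usage sketch of that interface (the file path is hypothetical, and the x/255 - 0.5 scaling used for the training data is applied here on the assumption that new inputs should be preprocessed the same way):

new_image = load_and_resize("signs/example.jpg")  # hypothetical file path
new_batch = normalize_zero_one(new_image.astype(np.float32))[np.newaxis, ...]
prediction = sess.run(y_pred, feed_dict={x: new_batch, keep_prob: 1.0})
print("predicted sign:", signnames.SignName.loc[np.argmax(prediction)])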
Note: Once you have completed all of the code implementations and successfully answered each question above, you may finalize your work by exporting the IPython Notebook as an HTML document. You can do this by using the menu above and navigating to File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.